Deadlocks occur in concurrent programs as a consequence of cyclic resource acquisition between
threads. In this paper we present a novel type system that guarantees deadlock freedom for a lan- 
guage with references, unstructured locking primitives, and locks which are implicitly associated 
with references. The proposed type system does not impose a strict lock acquisition order and thus 
increases programming language expressiveness. 



1 Introduction 

Lock-based synchronization may give rise to deadlocks. Two or more threads are deadlocked when each 
of them is waiting for a lock that is acquired by another thread. According to Coffman et al. [4], a set of
threads reaches a deadlocked state when the following conditions hold: 

- Mutual exclusion: Threads claim exclusive control of the locks that they acquire. 

- Hold and wait: Threads already holding locks may request (and wait for) new locks. 

- No preemption: Locks cannot be forcibly removed from threads; they must be released explicitly 
by the thread that acquired them. 

- Circular wait: Two or more threads form a circular chain, where each thread waits for a lock held 
by the next thread in the chain. 
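
The circular wait condition can be made concrete with a small wait-for graph check. The sketch below is only an illustration and is not part of the formal development; the function name `find_circular_wait` and the dictionary encoding of the wait-for relation are assumptions made for this example, and each thread is assumed to wait for at most one lock.

```python
from typing import Dict, List, Optional


def find_circular_wait(waits_for: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return one cycle of threads in the wait-for graph, or None.

    `waits_for` maps each thread to the thread that holds the lock it is
    blocked on (None when the thread is not blocked).
    """
    for start in waits_for:
        seen: List[str] = []
        current: Optional[str] = start
        # Follow the chain of "waits for" edges until it ends or repeats.
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:
            # The chain came back to a thread already visited: a cycle.
            return seen[seen.index(current):]
    return None


deadlocked = {"t1": "t2", "t2": "t1", "t3": None}
progressing = {"t1": "t2", "t2": None}
print(find_circular_wait(deadlocked))
print(find_circular_wait(progressing))
```

A detection-and-recovery scheme would run such a check at runtime; prevention and avoidance schemes instead ensure that no such cycle can ever be formed.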

Coffman has identified three strategies that guarantee deadlock-freedom by denying at least one of 
the above conditions before or during program execution: 

- Deadlock prevention: At each point of execution, ensure that at least one of the above conditions 
is not satisfied. Thus, programs that fall into this category are correct by design. 

- Deadlock detection and recovery: A dedicated observer thread determines whether the above con- 
ditions are satisfied and preempts some of the deadlocked threads, releasing (some of) their locks, 
so that the remaining threads can make progress. 

- Deadlock avoidance: Using information that is computed in advance regarding thread resource 
allocation, determine whether granting a lock will bring the program to an unsafe state, i.e., a state 
which can result in deadlock, and only grant locks that lead to safe states. 

Several type systems have been proposed that guarantee deadlock freedom, the majority of which are
based on the first two strategies. In the deadlock prevention category, one finds type and effect systems
that guarantee deadlock freedom by statically enforcing a global lock acquisition order that must be
respected by all threads [6, 2, 10, 12, 13]. In this setting, lock handles are associated with type-level lock
names via the use of singleton types. Thus, handle lk_ι is of type lk(ι). The same applies to lock handle
variables. The effect system tracks the order of lock operations on handles or variables and determines
whether all threads acquire locks in the same order.

Using a strict lock acquisition order is a constraint we want to avoid, as it unnecessarily rejects many
correct programs. It is not hard to come up with an example that shows that imposing a partial order on 
locks is too restrictive. The simplest of such examples can be reduced to program fragments of the form: 

(lock x in ... lock y in ...) || (lock y in ... lock x in ...)

In a few words, there are two parallel threads which acquire two different locks, x and y, in reverse order. 
When trying to find a partial order < on locks for this program, the type system or static analysis tool 
will deduce that x < y must be true, because of the first thread, and that y < x must be true, because of 
the second. Thus, the program will be rejected, both in the system of Flanagan and Abadi, which requires
annotations [5], and in the system of Kobayashi, which employs inference [10], as there is no single lock
order for both threads. Similar considerations apply to the more recent works of Suenaga [12] and
Vasconcelos et al. [13] dealing with non lexically-scoped locks.
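
To illustrate why an order-based analysis rejects this program, the following sketch collects the ordering constraints induced by nested acquisitions and tries to linearize them. It is a simplified model and not the algorithm of any of the cited systems; the function name `lock_order` and the list-of-lists encoding of threads are assumptions of this example.

```python
from typing import Dict, List, Optional, Set


def lock_order(threads: List[List[str]]) -> Optional[List[str]]:
    """Try to find one global lock order consistent with all threads.

    Each thread is the list of locks it acquires, in order, while still
    holding the earlier ones. Returns a valid order, or None when the
    induced constraints are cyclic.
    """
    succ: Dict[str, Set[str]] = {}
    for acquired in threads:
        for i, held in enumerate(acquired):
            succ.setdefault(held, set())
            for later in acquired[i + 1:]:
                succ[held].add(later)  # constraint: held < later

    # Kahn's algorithm: repeatedly emit locks with no unmet predecessor.
    indegree = {lock: 0 for lock in succ}
    for lock in succ:
        for later in succ[lock]:
            indegree[later] += 1
    ready = sorted(lock for lock, d in indegree.items() if d == 0)
    order: List[str] = []
    while ready:
        lock = ready.pop(0)
        order.append(lock)
        for later in sorted(succ[lock]):
            indegree[later] -= 1
            if indegree[later] == 0:
                ready.append(later)
    return order if len(order) == len(succ) else None


print(lock_order([["x", "y"], ["y", "x"]]))       # constraints x < y and y < x
print(lock_order([["x", "y"], ["x", "y", "z"]]))  # constraints are acyclic
```

For the two-thread program above, the constraints x < y and y < x form a cycle, so no order exists and the program is rejected even though a runtime could still schedule it safely.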

Our work follows the third strategy (deadlock avoidance). It is based on an idea put forward recently 
by Boudol, who proposed a type system for deadlock avoidance that is more permissive than existing 
approaches [1]. However, his system is suitable for programs that use exclusively lexically-scoped
locking primitives. In this paper we present a simple language with functions, mutable references, explicit
(de-)allocation constructs and unstructured (i.e., non lexically-scoped) locking primitives. Our approach
ensures deadlock freedom for the proposed language by preserving exact information about the order of
events, both statically and dynamically. It also forms the basis for a much simpler approach to providing
deadlock freedom, following a quite different path, that is easier to program and amenable to type
inference, which has been implemented for C/pthreads.

In the next section, we informally describe Boudol's idea and present an informal overview of our 
type and effect system. In Section 3 we formally define the syntax of our language, its operational
semantics and the type and effect system. In Section 4 we reason about the soundness of our system and
the paper ends with a few concluding remarks. 

2 Deadlock Avoidance 

Recently, Boudol developed a type and effect system for deadlock freedom [1], which is based on dead- 
lock avoidance. The effect system calculates for each expression the set of acquired locks and annotates 
lock operations with the "future" lockset. The runtime system utilizes the inserted annotations so that 
each lock operation can only proceed when its "future" lockset is unlocked. The main advantage of 
Boudol's type system is that it allows a larger class of programs to type check and thus increases the 
programming language expressiveness as well as concurrency by allowing arbitrary locking schemes. 

The previous example can be rewritten in Boudol's language as follows, assuming that the only lock 
operations in the two threads are those visible: 

(lock_{y} x in ... lock_∅ y in ...) || (lock_{x} y in ... lock_∅ x in ...)

This program is accepted by Boudol's type system which, in general, allows locks to be acquired in any 
order. At runtime, the first lock operation of the first thread must ensure that y has not been acquired 
by the second (or any other) thread, before granting x (and symmetrically for the second thread). The 
second lock operations need not ensure anything special, as the future locksets are empty. 
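
The runtime check described above can be sketched as follows. This is a minimal single-process model written for illustration, not Boudol's implementation; the class name `FutureLocksetRuntime` and its methods are hypothetical, and blocking is modelled simply by returning False.

```python
from typing import Dict, Iterable


class FutureLocksetRuntime:
    """Grants a lock only if the lock and its future lockset are free."""

    def __init__(self) -> None:
        self.owner: Dict[str, str] = {}  # lock name -> owning thread

    def try_lock(self, thread: str, lock: str, future: Iterable[str]) -> bool:
        needed = {lock} | set(future)
        for name in needed:
            holder = self.owner.get(name)
            if holder is not None and holder != thread:
                return False  # another thread holds a lock we will need
        self.owner[lock] = thread
        return True

    def unlock(self, thread: str, lock: str) -> None:
        if self.owner.get(lock) != thread:
            raise ValueError(f"{thread} does not hold {lock}")
        del self.owner[lock]


rt = FutureLocksetRuntime()
first = rt.try_lock("t1", "x", {"y"})     # t1: lock_{y} x
second = rt.try_lock("t2", "y", {"x"})    # t2: lock_{x} y must wait, x is held by t1
third = rt.try_lock("t1", "y", set())     # t1: lock_∅ y
rt.unlock("t1", "y")
rt.unlock("t1", "x")
retry = rt.try_lock("t2", "y", {"x"})     # now both y and x are free
print(first, second, third, retry)
```

Because the second thread is refused y while the first thread still needs it, the state in which each thread holds one lock and waits for the other is never reached.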

The main disadvantage of Boudol's work is that locking operations have to be lexically-scoped. In
fact, as we show here, even if Boudol's language had lock/unlock constructs, instead of lock ... in ...,
his type system is not sufficient to guarantee deadlock freedom. The example program in Figure 1(a) will
help us see why: It updates the values of three shared variables, x, y and z, making sure at each step that
only the strictly necessary locks are held.¹

    (a) before substitution

    let f = λx.λy.λz. lock_{y} x;   x := x + 1;
                      lock_{z} y;   y := y + x;
                      unlock x;
                      lock_∅ z;     z := z + y;
                      unlock z;
                      unlock y
    in f a a b

    (b) after substitution

    lock_{a} a;   a := a + 1;
    lock_{b} a;   a := a + a;
    unlock a;
    lock_∅ b;     b := b + a;
    unlock b;
    unlock a

Figure 1: An example program, which is well typed before substitution (a) but not after (b).

In our naively extended (and broken, as will be shown) version of Boudol's system, the program in
Figure 1(a) will type check. The future lockset annotations of the three locking operations in the body
of f are {y}, {z} and ∅, respectively. (This is easily verified by observing the lock operations between a
specific lock/unlock pair.) Now, function f is used by instantiating both x and y with the same variable
a, and instantiating z with a different variable b. The result of this substitution is shown in Figure 1(b).
The first thing to notice is that, if we want this program to work in this case, locks have to be re-entrant. 
This roughly means that if a thread holds some lock, it can try to acquire the same lock again; this will 
immediately succeed, but then the thread will have to release the lock twice, before it is actually released. 

Even with re-entrant locks, however, the program in Figure 1(b) does not type check with the present
annotations. The first lock for a now matches with the last (and not the first) unlock; this means that a 
will remain locked during the whole execution of the program. In the meantime b is locked, so the future 
lockset annotation of the first lock should contain b, but it does not. (The annotation of the second lock 
contains b, but blocking there if lock b is not available does not prevent a possible deadlock; lock a has 
already been acquired.) So, the technical failure of our naively extended language is that the preservation 
lemma breaks. From a more pragmatic point of view, if a thread running in parallel already holds b and, 
before releasing it, is about to acquire a, a deadlock can occur. The naive extension of Boudol's system 
also fails for another reason: it is based on the assumption that calling a function cannot affect the set of 
locks held by a thread. This is obviously not true, if non lexically-scoped locking is to be supported. 

The type and effect system proposed in this paper supports unstructured locking, by preserving more 
information at the effect level. Instead of treating effects as unordered collections of locks, our type 
system precisely tracks effects as an order of lock and unlock operations, without enforcing a strict 
lock-acquisition order. The continuation effect of a term represents the effect of the function code suc- 
ceeding that term. In our approach, lock operations are annotated with a continuation effect. When a 
lock operation is evaluated, the future lockset is calculated by inspecting its continuation effect. The 
lock operation succeeds only when both the lock and the future lockset are available. 

Figure 2 illustrates the same program as in Figure 1, except that locking operations are now annotated
with continuation effects. For example, the annotation [y+, x-, z+, z-, y-] at the first lock operation
means that in the future (i.e., after this lock operation) y will be acquired, then x will be released, and so
on.² If x and y were different, the runtime system would deduce that between this lock operation on x
and the corresponding unlock operation, only y is locked, so the future lockset in Boudol's sense would
be {y}. On the other hand, if x and y are instantiated with the same a, the annotation becomes [a+, a-,
b+, b-, a-] and the future lockset that is calculated is now the correct {a, b}. In a real implementation,
there are several optimizations that can be performed (e.g., pre-calculation of effects) but we do not deal
with them in this paper.

    (a) before substitution

    let f = λx.λy.λz. lock_[y+, x-, z+, z-, y-] x;   x := x + 1;
                      lock_[x-, z+, z-, y-] y;       y := y + x;
                      unlock x;
                      lock_[z-, y-] z;               z := z + y;
                      unlock z;
                      unlock y
    in f a a b

    (b) after substitution

    lock_[a+, a-, b+, b-, a-] a;   a := a + 1;
    lock_[a-, b+, b-, a-] a;       a := a + a;
    unlock a;
    lock_[b-, a-] b;               b := b + a;
    unlock b;
    unlock a

Figure 2: The program of Figure 1 with continuation effect annotations; now well typed in both cases.

¹ To simplify presentation, we assume here that there is one implicit lock per variable, which has the same name. This is
more or less consistent with our formalization in Section 3.

² In the examples of this section, a simplified version of effects is used, to make presentation easier. In the formalism of
Section 3 the plus and minus signs would be encoded as differences in lock counts, e.g., y+ would be encoded by a y^(1,0) (an
unlocked y) followed in time by a y^(1,1) (a locked y).
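
The calculation of a future lockset from a continuation effect can be sketched as below, using the simplified plus/minus notation of this section. The sketch reflects our reading of the informal description rather than the formal lockset function; the names `parse` and `future_lockset` and the string encoding of effects are assumptions of this example.

```python
from typing import List, Set, Tuple

Event = Tuple[str, str]  # (lock name, "+" for acquire or "-" for release)


def parse(effect: str) -> List[Event]:
    """Turn a string such as "y+ x- z+" into a list of events."""
    return [(item[:-1], item[-1]) for item in effect.split()]


def future_lockset(lock: str, continuation: List[Event]) -> Set[str]:
    """Locks acquired from now until `lock` is finally released."""
    held = 1  # the lock operation being evaluated acquires `lock` once
    future: Set[str] = set()
    for name, sign in continuation:
        if sign == "+":
            future.add(name)
            if name == lock:
                held += 1  # re-entrant acquisition of the same lock
        elif name == lock:
            held -= 1
            if held == 0:
                break  # matching unlock reached; later events are irrelevant
    return future


before = future_lockset("x", parse("y+ x- z+ z- y-"))
after = future_lockset("a", parse("a+ a- b+ b- a-"))
print(before)
print(after)
```

With distinct x and y the scan stops at the first release of x and yields {y}; after substitution the first release of a only undoes the re-entrant acquisition, so the scan continues past b and yields {a, b}.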

There are three issues that must be faced, before we can apply this approach to a full programming
language. First, we need to consider continuation effects in an interprocedural manner: it is possible
that a lock operation in the body of function f matches with an unlock operation in the body of function
g after the point where f was called, directly or indirectly. In this case, the future lockset for the lock
operation may contain locks that are not visible in the body of f. We choose to compute function effects
intraprocedurally and to annotate each application term with a continuation effect, which represents the
effect of the code succeeding the application term in the calling function's body. A runtime mechanism
pushes information about continuation effects on the stack and, if necessary, uses this information to
correctly calculate future locksets, taking into account the continuation effects of the enclosing contexts.
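
The role of the stack can be illustrated with a small extension of the scan over a single annotation: when the lock annotation is exhausted before the matching unlock is found, the scan continues in the continuation effects of the enclosing calls. This is a sketch under our own assumptions (the function name `future_lockset`, the pair encoding of events, and a stack listed innermost call first), not the runtime mechanism of the formal semantics.

```python
from typing import List, Set, Tuple

Event = Tuple[str, str]  # (lock name, "+" or "-")


def future_lockset(lock: str, annotation: List[Event],
                   stack: List[List[Event]]) -> Set[str]:
    """Scan the lock annotation, then the callers' continuation effects."""
    held = 1
    future: Set[str] = set()
    for effect in [annotation, *stack]:  # innermost continuation first
        for name, sign in effect:
            if sign == "+":
                future.add(name)
                if name == lock:
                    held += 1
            elif name == lock:
                held -= 1
                if held == 0:
                    return future
    return future


# f's body is just "lock x"; its caller g continues with
# "lock y; unlock y; unlock x" after the call returns.
annotation_in_f: List[Event] = []
continuation_in_g: List[Event] = [("y", "+"), ("y", "-"), ("x", "-")]

local_only = future_lockset("x", annotation_in_f, [])
with_stack = future_lockset("x", annotation_in_f, [continuation_in_g])
print(local_only, with_stack)
```

Looking only at the body of f would give an empty future lockset for x, whereas taking the caller's continuation into account reveals that y is acquired while x is still held.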

Second, we need to support conditional statements. The tricky part here is that, even in a simple 
conditional statement such as 

if c then (lock x; ... unlock x) else (lock y; ... unlock y) 

the two branches have different effects: [x+, x-] and [y+, y-], respectively. A typical type and effect
system would have to reject this program, but this would be very restrictive in our case. We resolve this
issue by requiring that the overall effect of both alternatives is the same. This (very roughly) means that,
after the plus and minus signs cancel each other out, we have equal numbers of plus or minus signs for
each lock in both alternatives. Furthermore, we assign the combined effect of the two alternatives to
the conditional statement, thus keeping track of the effect of both branches; in the example above, the
combined effect is denoted by [x+, x-] ? [y+, y-].
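
The "very rough" reading of this requirement can be sketched as a comparison of net lock counts. The sketch below follows only the informal description given here, not the precise premise of the typing rule for conditionals; the names `net_effect` and `branches_compatible` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Event = Tuple[str, str]  # (lock name, "+" or "-")


def net_effect(effect: List[Event]) -> Dict[str, int]:
    """Per-lock balance after plus and minus signs cancel out."""
    net: Counter = Counter()
    for name, sign in effect:
        net[name] += 1 if sign == "+" else -1
    return {name: count for name, count in net.items() if count != 0}


def branches_compatible(then_effect: List[Event],
                        else_effect: List[Event]) -> bool:
    """Both alternatives must leave every lock count changed equally."""
    return net_effect(then_effect) == net_effect(else_effect)


balanced = branches_compatible([("x", "+"), ("x", "-")],
                               [("y", "+"), ("y", "-")])
unbalanced = branches_compatible([("x", "+")],
                                 [("y", "+"), ("y", "-")])
print(balanced, unbalanced)
```

The conditional from the example is accepted because both branches have an empty net effect, while a branch that leaves x locked cannot be paired with one that does not.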

The third and most complicated issue that we need to face is support for recursive functions. Again,
consider a simple recursive function of the form

fix f. λx. if c then (... f(y) ...) else ...

Let us call γ_f the effect of f and γ_b the computed effect for the body of f. It is easy to see that γ_b
must contain γ_f and, if any lock/unlock operations are present in the body of f, γ_b will be strictly larger
than γ_f. Again, a typical type and effect system would require that γ_b = γ_f and reject this function
definition. We resolve this issue by computing a summary of γ_b and requiring that the summary is equal
to γ_f. In computing the summary, we can make several simplifications that preserve the calculation of
future locksets for operations residing outside function f. For instance, we are not interested whether a
lock is acquired and released many times or just once, we are not interested in the exact order in which
lock/unlock pairs occur, and we can flatten branches.
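
One possible shape of such a summary is sketched below: the set of locks acquired anywhere in the body, together with the unmatched lock/unlock operations. This is an illustration of the simplifications listed above and not the summary function of the formal system; the encoding of a conditional as a `("?", then_effect, else_effect)` tuple and the names `acquired`, `unmatched` and `summary` are assumptions of this example, and lock names are assumed to differ from "?".

```python
from typing import Dict, FrozenSet, List, Tuple


def acquired(effect: List[tuple]) -> FrozenSet[str]:
    """All locks acquired anywhere in the effect, with branches flattened."""
    locks = set()
    for item in effect:
        if item[0] == "?":
            locks |= acquired(item[1]) | acquired(item[2])
        elif item[1] == "+":
            locks.add(item[0])
    return frozenset(locks)


def unmatched(effect: List[tuple]) -> Dict[str, int]:
    """Net lock counts; only one branch is followed at each conditional,
    since both branches are assumed to have the same net effect."""
    net: Dict[str, int] = {}

    def walk(items: List[tuple]) -> None:
        for item in items:
            if item[0] == "?":
                walk(item[1])
            else:
                name, sign = item
                net[name] = net.get(name, 0) + (1 if sign == "+" else -1)

    walk(effect)
    return {name: count for name, count in net.items() if count != 0}


def summary(effect: List[tuple]) -> Tuple[FrozenSet[str], Dict[str, int]]:
    return acquired(effect), unmatched(effect)


body_1 = [("x", "+"), ("?", [("y", "+"), ("y", "-")], [("z", "+"), ("z", "-")]),
          ("x", "-"), ("w", "-")]
body_2 = [("z", "+"), ("z", "-"), ("y", "+"), ("y", "-"),
          ("x", "+"), ("x", "-"), ("x", "+"), ("x", "-"), ("w", "-")]
print(summary(body_1))
print(summary(body_1) == summary(body_2))
```

The two bodies differ in order, repetition and branching, yet they receive the same summary, which is the property that allows a fixed effect annotation to be checked for a recursive function.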
    Expression    e ::= x | f | (e e)^ξ | (e)[r] | e := e
                      | deref e | let ρ, x = ref e in e
                      | share e | release e | lock_γ e
                      | unlock e | () | pop_γ e | loc_ι
                      | if e then e else e | true | false
    Function      f ::= λx. e as τ →^γ τ | Λρ. f | fix x : τ. f
    Value         v ::= f | () | loc_ι | true | false
    Location      r ::= ρ | ι@n | ρ@n
    Calling mode  ξ ::= seq(γ) | par
    Capability    κ ::= n, n | \overline{n, n}
    Effect        γ ::= ∅ | γ, r^κ | γ, γ ? γ
    Type          τ ::= ⟨⟩ | τ →^γ τ | ∀ρ. τ | ref(τ, r) | bool

Figure 3: Language syntax.



3 Formalism 

The syntax of our language is illustrated in Figure 3, where x and ρ range over term and "region"
variables, respectively. Similarly to our previous work [7], a region is thought of as a memory unit that
can be shared between threads and whose contents can be atomically locked. In this paper, we make
the simplistic assumption that there is a one-to-one correspondence between regions and memory cells
(locations), but this is of course not necessary.

The language core comprises variables (x), constants (the unit value, true and false), functions
(f), and function application. Functions can be location polymorphic (Λρ. f) and location application
is explicit (e[ρ]). Monomorphic functions (λx. e) must be annotated with their type. The application of
monomorphic functions is annotated with a calling mode (ξ), which is seq(γ) for normal (sequential)
application and par for parallel application. Notice that sequential application terms are annotated with
γ, the continuation effect as mentioned earlier. The semantics of parallel application is that, once the
parameters have been evaluated and substituted, the function's body is moved to a new thread of execution
and the spawning thread can proceed with the remaining computation in parallel with the new thread.
The term pop_γ e encloses a function body e and can only appear during evaluation. The same applies
to constant locations ι@n, which cannot exist at the source level. The construct let ρ, x = ref e1 in e2
allocates a fresh cell, initializes it to e1, and associates it with variables ρ and x within expression e2. As
in other approaches, we use ρ as the type-level representation of the new cell's location. The reference
variable x has the singleton type ref(τ, ρ), where τ is the type of the cell's contents. This allows the
type system to connect x and ρ and thus to statically track uses of the new cell. As will be explained
later, the cell can be consumed either by deallocation or by transferring its ownership to another thread.
Assignment and dereference operators are standard. The value loc_ι represents a reference to a location
ι and is introduced during evaluation. Source programs cannot contain loc_ι.

At any given program point, each cell is associated with a capability (κ). Capabilities consist of two
natural numbers, the capability counts: the cell reference count, which denotes whether the cell is live,
and the lock count, which denotes whether the cell has been locked to provide the current thread with
exclusive access to its contents. Capability counts determine the validity of operations on cells. When first
allocated, a cell starts with capability (1, 1), meaning that it is live and locked, which provides exclusive
access to the thread which allocated it. (This is our equivalent of thread-local data.) Capabilities can be
either pure (n1, n2) or impure (\overline{n1, n2}). In both cases, it is implied that the current thread can decrement
the cell reference count n1 times and the lock count n2 times. Similarly to fractional permissions [3],
impure capabilities denote that a location may be aliased. Our type system requires aliasing information
so as to determine whether it is safe to pass lock capabilities to new threads.

The remaining language constructs (share e, release e, lock_γ e and unlock e) operate on a
reference e. The first two constructs increment and decrement the cell reference count of e respectively.

    Configuration  C ::= S; T
    Store          S ::= ∅ | S, ι ↦ v
    Threads        T ::= ∅ | T, n : θ; e
    Locations      ε ::= ∅ | ε, ι
    Access lists   θ ::= ∅ | θ, ι ↦ n; n; ε; ε
    Stack          E ::= □ | E[F]
    Frame          F ::= (□ e)^ξ | (v □)^ξ | (□)[r] | let ρ, x = ref □ in e
                       | deref □ | □ := e | v := □ | share □ | release □
                       | lock_γ □ | unlock □ | pop_γ □
                       | if □ then e1 else e2

    Reduction relation

    (E-SN)   v' = λx. e as τ1 →^γ_a τ2    fresh n'    (θ1, θ2) = split(θ, max(γ_a))
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[(v' v)^par] ⇝ S; T, n : θ1; E[()], n' : θ2; □[(v' v)^seq(min(γ_a))]

    (E-T)    ∀ι. θ(ι) = (0, 0)
             ─────────────────────────────
             S; T, n : θ; () ⇝ S; T

    (E-A)    v' = λx. e1 as τ'
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[(v' v)^seq(γ_b)] ⇝ S; T, n : θ; E[pop_γ_b e1[v/x]]

    (E-PP)   S; T, n : θ; E[pop_γ v] ⇝ S; T, n : θ; E[v]

    (E-RP)   fresh n2
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[(Λρ. f)[ι@n1]] ⇝ S; T, n : θ; E[f[ι@n2/ρ]]

    (E-FX)   S; T, n : θ; E[(fix x : τ. f v)^seq(γ_a)] ⇝ S; T, n : θ; E[(f[fix x : τ. f/x] v)^seq(γ_a)]

    (E-IT)   S; T, n : θ; E[if true then e1 else e2] ⇝ S; T, n : θ; E[e1]

    (E-IF)   S; T, n : θ; E[if false then e1 else e2] ⇝ S; T, n : θ; E[e2]

    (E-NG)   fresh ι@n'    S' = S, ι ↦ v    θ' = θ, ι ↦ 1; 1; ∅; ∅
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[let ρ, x = ref v in e2] ⇝ S'; T, n : θ'; E[e2[ι@n'/ρ][loc_ι/x]]

    (E-AS)   θ(ι) ≥ (1, 1)    ι ∉ locked(T)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[loc_ι := v] ⇝ S[ι ↦ v]; T, n : θ; E[()]

    (E-D)    θ(ι) ≥ (1, 1)    ι ∉ locked(T)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[deref loc_ι] ⇝ S; T, n : θ; E[S(ι)]

    (E-SH)   θ(ι) ≥ (1, 0)    θ' = θ +_ι (1, 0)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[share loc_ι] ⇝ S; T, n : θ'; E[()]

    (E-RL)   θ(ι) ≥ (1, 0)    θ(ι) = (n1, n2)    n1 = 1 ⇒ n2 = 0    θ' = θ +_ι (-1, 0)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[release loc_ι] ⇝ S; T, n : θ'; E[()]

    (E-LK0)  ε = lockset(ι, 1, E[pop_γ1 □])    θ = θ'', ι ↦ n1; 0; ε1; ε2
             θ' = θ'', ι ↦ n1; 1; dom(S); ε    n1 ≥ 1    locked(T) ∩ ε = ∅
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[lock_γ1 loc_ι] ⇝ S; T, n : θ'; E[()]

    (E-LK1)  θ(ι) ≥ (1, 1)    θ' = θ +_ι (0, 1)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[lock_γ1 loc_ι] ⇝ S; T, n : θ'; E[()]

    (E-UL)   θ(ι) ≥ (1, 1)    θ' = θ +_ι (0, -1)
             ─────────────────────────────────────────────────────────────────────
             S; T, n : θ; E[unlock loc_ι] ⇝ S; T, n : θ'; E[()]

Figure 4: Operational semantics.



Similarly, the latter two constructs increment and decrement the lock count of e. As mentioned earlier,
the runtime system inspects the lock annotation γ to determine whether it is safe to lock e.

3.1 Operational Semantics 

We define a small-step operational semantics for our language in Figure 4.³ The evaluation relation
transforms configurations. A configuration C consists of an abstract store S and a thread map T.⁴ A store
S maps constant locations (ι) to values (v). A thread map T associates thread identifiers to expressions
(i.e., threads) and access lists. An access list θ maps location identifiers to reference and lock counts.

³ Due to space limitations, some of the functions and judgements that are used by the operational and (later) the static
semantics are not formally defined in this paper. Verbal descriptions are given in the Appendix. A full formalization is given in
the companion technical report [8].

⁴ The order of elements in comma-separated lists, e.g., in a store S or in a list of threads T, is unimportant; we consider all
list permutations as equivalent.

A frame F is an expression with a hole, represented as □. The hole indicates the position where
the next reduction step can take place. A thread evaluation context E is defined as a stack of nested
frames. Our notion of evaluation context imposes a call-by-value evaluation strategy on our language.
Subexpressions are evaluated in a left-to-right order. We assume that concurrent reduction events can
be totally ordered [11]. At each step, a random thread (n) is chosen from the thread list for evaluation.
Therefore, the evaluation rules are non-deterministic.

When a parallel function application redex is detected within the evaluation context of a thread, a
new thread is created (rule E-SN). The redex is replaced with a unit value in the currently executed
thread and a new thread is added to the thread list, with a fresh thread identifier. The calling mode of
the application term is changed from parallel to sequential. The continuation effect associated with the
sequential annotation equals the resulting effect of the function being applied (i.e., min(γ_a)). Notice
that θ is divided into two lists θ1 and θ2 using the new thread's initial effect max(γ_a) as a reference for
consuming the appropriate number of counts from θ. On the other hand, when evaluation of a thread
reduces to a unit value, the thread is removed from the thread list (rule E-T). This is successful only if
the thread has previously released all of its resources.

The rule for sequential function application (E-A) reduces an application redex to a pop expression,
which contains the body of the function and is annotated with the same effect as the application term.
Evaluation propagates through pop expressions (rule E-PP), which are only useful for calculating future
locksets in rule E-LK0. The rules for evaluating the application of polymorphic functions (E-RP) and
recursive functions (E-FX) are standard, as well as the rules for evaluating conditionals (E-IT and E-IF).

The rules for reference allocation, assignment and dereference are straightforward. Rule E-NG
appends a fresh location ι (with initial value v) and the dynamic count (1, 1) to S and θ respectively.
Rules E-AS and E-D require that the location (ι) being accessed is both live and accessible and no other
thread has access to ι. Therefore dangling memory location accesses as well as unsynchronized accesses
cause the evaluation to get stuck. Furthermore, the rules E-SH, E-RL and E-UL manipulate a cell's
reference or lock count. They are also straightforward, simply checking that the cell is live and (in the
case of E-UL) locked. Rule E-RL makes sure that a cell is unlocked before its reference count can be
decremented to zero.
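
The count checks performed by these rules can be sketched with a per-thread table of (reference count, lock count) pairs. The sketch is a simplified model written for illustration: the names `AccessList` and `Stuck` are hypothetical, only one thread is modelled, and the side condition that no other thread holds the location is omitted.

```python
from typing import Dict, Tuple


class Stuck(Exception):
    """Raised where the corresponding evaluation rule would not apply."""


class AccessList:
    def __init__(self) -> None:
        self.counts: Dict[str, Tuple[int, int]] = {}  # loc -> (refs, locks)

    def _require(self, loc: str, min_refs: int, min_locks: int) -> Tuple[int, int]:
        refs, locks = self.counts.get(loc, (0, 0))
        if refs < min_refs or locks < min_locks:
            raise Stuck(f"{loc} has counts {(refs, locks)}")
        return refs, locks

    def allocate(self, loc: str) -> None:      # as in E-NG: live and locked
        self.counts[loc] = (1, 1)

    def check_access(self, loc: str) -> None:  # as in E-AS and E-D
        self._require(loc, 1, 1)

    def share(self, loc: str) -> None:         # as in E-SH
        refs, locks = self._require(loc, 1, 0)
        self.counts[loc] = (refs + 1, locks)

    def release(self, loc: str) -> None:       # as in E-RL
        refs, locks = self._require(loc, 1, 0)
        if refs == 1 and locks != 0:
            raise Stuck(f"{loc} must be unlocked before its last release")
        self.counts[loc] = (refs - 1, locks)

    def unlock(self, loc: str) -> None:        # as in E-UL
        refs, locks = self._require(loc, 1, 1)
        self.counts[loc] = (refs, locks - 1)


theta = AccessList()
theta.allocate("c")
theta.check_access("c")   # allowed: the cell is live and locked
theta.unlock("c")         # counts become (1, 0)
theta.release("c")        # counts become (0, 0)
print(theta.counts)
```

After the final release the cell is neither live nor locked, so any further access to it would raise `Stuck`, mirroring the way dangling accesses cause evaluation to get stuck.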

The most interesting rule is E-LK0, which applies when the reference being locked (ι) is initially
unlocked. The future lockset (ε) is dynamically computed, by inspecting the preceding stack frames (E)
as well as the lock annotation (γ1). The lockset ε is a list of locations (and thus locks). The reference
ι must be live and no other thread must hold either ι or any of the locations in ε. Upon success, the
lock count of ι is incremented by one. On the other hand, rule E-LK1 applies when ι has already been
locked by the current thread (that tries to lock it again). This immediately succeeds and the lock count is
incremented by one.

3.2 Static Semantics 

We now present our type and effect system and discuss the most interesting parts. Effects are used to
statically track the capability of each cell. An effect (γ) is an ordered list of elements of the form r^κ and
summarizes the sequence of operations (e.g., locking or sharing) on references. The syntax of types in
Figure 3 is more or less standard: Atomic types consist of base types (the unit type, denoted
by ⟨⟩, and bool); reference types ref(τ, r) are associated with a type-level cell name r and monomorphic
function types carry an effect. Figure 5 contains the typing rules. The typing relation is denoted by
M; Δ; Γ ⊢ e : τ & (γ; γ'), where M; Δ; Γ is the typing context, e is an expression, τ is the type attributed
to e, γ is the input effect, and γ' is the output effect. In the typing context, M is a mapping of constant
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Typing relation M; A;T h e : r&(y;y') 



hM;A;T;y;y hM;A;r;y;y hM;A;T;y;y 

(T-U) — (T-TR) — (T-FL) 



M;A;Th () : ()&(y;y) M; A; T \- true : bool &(y;y) M;A;T h false : bool&(y;y) 

7b 

hM;A;T;y;y hM;A;T;y;y r =t\ — >t 2 M;Aht' T=;r 

(x-.T')er t^t' seq(0) hy/, => M;A;T,x :t\V e\: T 2 &(m\n(y h y,y b ) 



M;A;T h x : T&(y;y) M;A;T h Ax.e\ as r' : r&(y;y) 

M; A,p;T h f : r&(y;y) M;Ahr M;Ahr[r/p] M;A;T h ei : Vp.r&(y;y') 
1^2 ^ lili^ (T-RF) ! : — 1 L w " (T-RP) 

M\A\T h Ap.f : Vp.r&(y;y) M;A;T h ( ei )[r] : T[r/p]&(y;y') 

M;A\Th ei:T X ^T 2 &.{yy,y') £ : vy 1 = y®y ct M\A;T h e : t' Sc{mm{y b );y b ) Jb^l' b 
M;A;Th e 2 :i"i&(y9;y 3 ) f = par=>T 2 = <) seq(y) h y' = y9y^ t'^t hM;A;T;y;y' 
— — — — (T-A) - (T-PP) 



M;A;T\-(ei e 2 ) f : T 2 &(y;y') M;A;T \- pop r e : r &(y;y') 

y* / / y « / / / 

r = Ti — >t 2 t =Tj — >r 2 t^t y a -y fl hM;A;T;y;y 
M;A;T,x : r h/ : r'&(y;y) y b = summary(y f ,) (; ht')6M r^ref(r',i) 



(T-FX) — ' (T-L) 



M; A;T h fix x : r.f : r&(y;y) M;A\T h loc, : r&(y;y) 

M; A;T h ei : tj &(yi \p;y') yi(p) = (l,l) M;Ahr M; A,p;r,x : ref(T 1; p) h e 2 : T&ty.p - ^) 

M;A;T h let p,x = ref <>i in e 2 : r&(y;y') 

M;A;rh ei : ref(T,r)&(y!;y') y(r)> (1,1) 

M;A;n-e 2 :T&(y;yi) y(r)> (1,1) M;A;rhei : ref(T,r)&(y;/) 

M'^'V — . , . „ — ; ^ — z — (1'*-') 



(T-NG) 



M;A;T h e\ := en : <>&(y;y') M; A;T h deref <?i : r&(y;y') 

M;A;1> e : ref^^&Cy,^ 1 - *^') M-A;The:ref(T, r )&(y,r K+( - 1 - 0) -y') 

k>(2,0) y(f) = k K = (n\,n2) n\ = => » 2 = y(r) = k 

(T-SH) — — ■ = — (T-RL) 



M\A\T h share e : <)&(y;y') M;A;T h release e : <>&(y;y') 

M;A;T h e : ref(T,r)&(y,j-*~ ( - ' 1) ;y') M;A;T h e : ref^&ty^+^V) 

k>(\,\) y(r) = K k>(1,0) y{r) = K 

(T-LK) (T-UL) 



M;A;T h lock y e : <>&(y;y') M; A;T h unlock e : <>&(y;y') 

M;A;T h ei : bool &(y,y 2 ?y3;y') max(y :: y 2 ) = max(y :: y 3 ) 

M\A;T h e 2 : r&(y;y :: y 2 ) M; A;T h e 3 : r&(y;y :: y 3 ) 

(T-IF) 

M; A;T h if ei then e 2 else e 3 : r&(y;y') 

Figure 5: Typing rules. 



locations to types, Δ is a set of cell variables, and Γ is a mapping of term variables to types.

Lock operations and sequential application terms are annotated with the continuation effect. This
imposes the restriction that effects must flow backwards. The input effect γ to an expression e is indeed
the continuation effect; it represents the operations that follow the evaluation of e. On the other hand,
the output effect γ' represents the combined operations of e and its continuation. The typing relation
guarantees that the input effect is always a prefix of the output effect.

The typing rules T-U, T-TR, T-FL, T-V, T-L, T-RF and T-RP are almost standard, except for the
occasional premise τ ≃ τ', which allows the type system to ignore the identifiers used for location aliasing
and, for example, treat the types ι@n1 and ι@n2 as equal. The typing rule T-F checks that, if the effect
γ_b that is annotated in the function's type is well formed, it is indeed the effect of the function's body. On
the other hand, the typing rule T-A for function application has a lot more work to do. It joins the input
effect γ (i.e., the continuation effect) and the function's effect γ_a, which contains the entire history of
events occurring in the function body; this is performed by the premise ξ ⊢ γ2 = γ ⊕ γ_a, which performs
all the necessary checks to ensure that all the capabilities required in the function's effect γ_a are available,
that pure capabilities are not aliased, and, in the case of parallel application, that no lock capabilities are
split and that the resulting capability of each location is zero. Rule T-PP works as a bridge between
the body of a function that is being executed and its calling environment. Rule T-FX uses the function
summary to summarize the effect of the function's body and to check that the type annotation indeed
contains the right summary. The effect summary is conservatively computed as the set of locks that are
acquired within the function body; the unmatched lock/unlock operations are also taken into account.

Rule T-NG for creating new cells passes the input effect γ to e2, the body of let, augmented by
ρ^(0,0). This means that, upon termination of e2, both references and locks of ρ must have been consumed.
The output effect of e2 is a γ1 such that ρ has capability (1, 1), which implies that when e2 starts being
evaluated ρ is live and locked. The input effect of the cell initializer expression e1 is equal to the output
effect of e2 without any occurrences of ρ. Rules T-AS and T-D check that, before dereferencing or
assigning to cells, a capability of at least (1, 1) is held. Rules T-SH, T-RL, T-LK and T-UL are the ones
that modify cell capabilities. In each rule, κ is the capability after the operation has been executed. In the
case of T-RL, if the reference count for a cell is decremented to zero, then all locks must have previously
been released. The last rule in Figure 5, and probably the least intuitive, is T-IF. Suppose γ is the input
(continuation) effect to a conditional expression. Then γ is passed as the input effect to both branches.
We know that the outputs of both branches will have γ as a common prefix; if γ2 and γ3 are the suffixes,
respectively, then γ2 ? γ3 is the combined suffix, which is passed as the input effect to the condition e1.

4 Type Safety 

In this section we present proof sketches for the fundamental theorems that prove type safety of our
language.⁵ The type safety formulation is based on proving progress, deadlock freedom and preservation
lemmata. Informally, a program written in our language is safe when for each thread of execution either
an evaluation step can be performed, or the thread is waiting to acquire a lock (blocked). In addition,
there must not exist any threads that have reached a deadlocked state. As discussed in Section 3.1, a
thread may become stuck when it performs an illegal operation, or when it references a location that has
been deallocated, or when it accesses a location that has not been locked.

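Thread Typing. Let $E[e]$ be the body of a thread and let $\theta$ be the thread's access list. Thread typing is 
defined by the rule: 

\[
\frac{
\begin{array}{c}
M;\Delta;\Gamma \vdash e : \tau \,\&\, (\gamma_a;\gamma_b) \qquad
M;\Delta;\Gamma \vdash E : \tau \xrightarrow{\gamma_a;\gamma_b} \langle\rangle \,\&\, (\gamma_1;\gamma_2) \\[2pt]
\forall r^{\kappa} \in \gamma_1.\ \kappa = (0,0) \qquad
\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta) \qquad
\mathit{lockset\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)
\end{array}
}{
M;\Delta;\Gamma \vdash_t \theta; E[e] : \langle\rangle \,\&\, (\gamma_1;\gamma_2)
}
\quad \text{(EA)}
\]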

First of all, thread typing implies the typing of E[e]. 

Secondly, thread typing establishes an exact correspondence between counts of the access list and 
counts of pop expression annotations that reside in the evaluation context $E[\mathtt{pop}_{\gamma_b}\,\square]$ (i.e., 
$\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$). The typing derivations of $e$ and $E$ establish an exact correspondence 
between the annotations of pop expressions and static effects. Therefore, for each location $\iota$ in $\theta$, the 
dynamic reference and lock counts of $\iota$ are identical to the static counts of $\iota$ deduced by the type system. 

Thirdly, thread typing enforces the invariant that the future lockset of an acquired lock at any 
program point is always a subset of the future lockset computed when the lock was initially acquired (i.e., 
$\mathit{lockset\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$). This invariant is essential for establishing deadlock freedom. Finally, all 
locations must be deallocated and released when a thread terminates ($\forall r^{\kappa} \in \gamma_1.\ \kappa = (0,0)$). 

⁵ The complete proofs are given in the companion technical report [8]. 

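Process Typing. A collection of threads $T$ is well typed if each thread in $T$ is well typed and thread 
identifiers are distinct: 

\[
\frac{}{M \vdash \emptyset}
\qquad\qquad
\frac{M;\emptyset;\emptyset \vdash_t \theta; e : \langle\rangle \,\&\, (\gamma;\gamma') \qquad M \vdash T \qquad n \notin \mathit{dom}(T)}
     {M \vdash T, n : \theta; e}
\]

Store Typing. A store $S$ is well typed if there is a one-to-one correspondence between $S$ and $M$ and 
all stored values are closed and well typed: 

\[
\frac{\mathit{dom}(M) = \mathit{dom}(S) \qquad \forall (\iota \mapsto \tau) \in M.\ M;\emptyset;\emptyset \vdash S(\iota) : \tau \,\&\, (\emptyset;\emptyset)}
     {M \vdash S}
\]

Configuration Typing. A configuration $S;T$ is well typed when both $T$ and $S$ are well typed, and locks 
are acquired by at most one thread (i.e., $\mathit{mutex}(T)$ holds). 

\[
\frac{M \vdash T \qquad M \vdash S \qquad \mathit{mutex}(T)}
     {M \vdash S;T}
\]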
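
The mutex premise of configuration typing can be illustrated with a small executable check. This is a minimal sketch rather than the paper's formal definition: it assumes, for illustration only, that a thread list is a Python dict from thread identifiers to access lists, and that an access list maps each location to a (reference count, lock count) pair, ignoring the lockset components that access lists also record.

```python
from collections import Counter

def locked(threads):
    """Count, per location, how many threads hold its lock (lock count > 0).

    `threads` is a hypothetical encoding: thread id -> {location: (refs, locks)}.
    """
    holders = Counter()
    for access_list in threads.values():
        for loc, (_refs, locks) in access_list.items():
            if locks > 0:
                holders[loc] += 1
    return holders

def mutex(threads):
    """True when every lock is held by at most one thread."""
    return all(count <= 1 for count in locked(threads).values())

# n1 only references x (lock count 0), so x has a single holder.
well_formed = {"n0": {"x": (1, 1)}, "n1": {"x": (1, 0), "y": (1, 2)}}
# Both threads claim a positive lock count on x.
ill_formed = {"n0": {"x": (1, 1)}, "n1": {"x": (1, 1)}}
```

Note that a lock count greater than one in a single thread (re-entrant locking, as for y above) does not violate the predicate.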

Deadlocked State. A set of threads $n_0, \ldots, n_k$, where $k > 0$, has reached a deadlocked state when each 
thread $n_i$ has acquired lock $\iota_{(i+1) \bmod (k+1)}$ and is waiting for lock $\iota_i$. 
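
The circular-wait shape of this definition can be checked mechanically. The sketch below assumes the reading used in the proof of Lemma 1, namely that thread i of the chain waits for lock i and holds lock (i+1) mod (k+1). The dictionaries `holds` and `waits_for` and the function name `is_deadlocked` are hypothetical encodings introduced only for this illustration.

```python
def is_deadlocked(chain, holds, waits_for):
    """Check whether the threads in `chain` (n_0 .. n_k) form a deadlock cycle.

    holds[n]     -- set of locks currently acquired by thread n
    waits_for[n] -- the lock thread n is blocked on, or None if it is running
    """
    size = len(chain)
    if size < 2:  # the definition requires k > 0, i.e. at least two threads
        return False
    wanted = [waits_for.get(n) for n in chain]  # wanted[i] plays the role of lock i
    if any(lock is None for lock in wanted):
        return False
    # Thread i must hold the lock that thread (i + 1) mod size is waiting for.
    return all(wanted[(i + 1) % size] in holds[chain[i]] for i in range(size))

holds = {"n0": {"l1"}, "n1": {"l0"}}
cyclic = {"n0": "l0", "n1": "l1"}    # each thread waits for the other's lock
acyclic = {"n0": "l0", "n1": None}   # n1 is still running and can release l0
```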

Not Stuck. A configuration $S;T$ is not stuck when each thread in $T$ can take one of the evaluation steps 
in Figure 4 or it is trying to acquire a lock which (either itself or its future lockset) is unavailable (i.e., 
$\mathit{blocked}(T, n)$ holds). 

Given these definitions, we can now present the main results of this paper. Progress, deadlock free- 
dom and preservation are formalized at the program level, i.e., for all concurrently executed threads. 

Lemma 1 (Deadlock Freedom) If the initial configuration takes n steps, where each step is well typed, 
then the resulting configuration has not reached a deadlocked state. 

Proof. Let us assume that $z$ threads have reached a deadlocked state and let $m \in [0, z-1]$, $k = (m+1) \bmod z$ 
and $o = (k+1) \bmod z$. According to the definition of deadlocked state, thread $m$ acquires lock $\iota_k$ and 
waits for lock $\iota_m$, whereas thread $k$ acquires lock $\iota_o$ and waits for lock $\iota_k$. Assume that $m$ is the first of the 
$z$ threads that acquires a lock, so it acquires lock $\iota_k$ before thread $k$ acquires lock $\iota_o$. 

Let us assume that $S_y;T_y$ is the configuration once $\iota_o$ is acquired by thread $k$ for the first time, $\epsilon_{1y}$ is 
the corresponding lockset of $\iota_o$ ($\epsilon_{1y} = \mathit{lockset}(\iota_o, 1, E_k[\mathtt{pop}_{\gamma_y}\,\square])$) and $\epsilon_{2y}$ is the set of all heap locations 
($\epsilon_{2y} = \mathit{dom}(S_y)$) at the time $\iota_o$ is acquired. Then, $\iota_k$ does not belong to $\epsilon_{1y}$, otherwise thread $k$ would have 
been blocked at the lock request of $\iota_o$, as $\iota_k$ is already owned by thread $m$. 

Let us assume that when thread $k$ attempts to acquire $\iota_k$, the configuration is of the form $S_x;T_x$. 
By the assumption of this lemma, all configurations are well typed, so $S_x;T_x$ is well typed 
as well. By inversion of the typing derivation of $S_x;T_x$, we obtain the typing derivation of thread $k$: 
in $E_k[\mathtt{lock}_{\gamma'_k}\,\mathtt{loc}_{\iota_k}]$, the redex $\mathtt{lock}_{\gamma'_k}\,\mathtt{loc}_{\iota_k}$ is well typed with input-output effect $(\gamma'_k;\gamma''_k)$, where $\kappa = \gamma'_k(\iota_k@n')$, 
$\kappa \geq (1,1)$, $\gamma''_k = \gamma'_k,(\iota_k@n')^{\kappa-(0,1)}$, and $\mathit{lockset\_ok}(E_k[\mathtt{pop}_{\gamma''_k}\,\square], \theta_k)$ holds, where $\theta_k$ is the access list of 
thread $k$. $\mathit{lockset\_ok}(E_k[\mathtt{pop}_{\gamma''_k}\,\square], \theta_k)$ implies $\mathit{lockset}(\iota_o, n_2, E_k[\mathtt{pop}_{\gamma''_k}\,\square]) \cap \epsilon_2 \subseteq \epsilon_1$, where $\theta_k = \theta'_k, \iota_o \mapsto 
n_1;n_2;\epsilon_1;\epsilon_2$ (notice that $n_2$ is positive, $\epsilon_1 = \epsilon_{1y}$ and $\epsilon_2 = \epsilon_{2y}$; this is immediate by the operational steps 
from $S_y;T_y$ to $S_x;T_x$ and rule E-LK0). 

We have assumed that $m$ is the first thread to lock $\iota_k$ at some step before $S_y;T_y$, thus $\iota_k \in \mathit{dom}(S_y)$ (the 
store can only grow; this is immediate by observing the operational semantics rules). By the definition 
of the lockset function and the definition of $\gamma''_k$ we have that $\iota_k \in \mathit{lockset}(\iota_o, n_2, E_k[\mathtt{pop}_{\gamma''_k}\,\square])$. Therefore, 
$\iota_k \in \mathit{lockset}(\iota_o, n_2, E_k[\mathtt{pop}_{\gamma''_k}\,\square]) \cap \mathit{dom}(S_y) \subseteq \epsilon_{1y}$, which is a contradiction. □ 

Lemma 2 (Progress) If $S;T$ is a well typed configuration, then $S;T$ is not stuck. 

Proof. It suffices to show that for any thread in $T$, a step can be performed or the blocked predicate holds for 
it. Let $n$ be an arbitrary thread in $T$ such that $T = T_1, n:\theta;e$ for some $T_1$. By inversion of the typing 
derivation of $S;T$ we have that $M;\emptyset;\emptyset \vdash_t \theta;e : \langle\rangle \,\&\, (\gamma;\gamma')$, $\mathit{mutex}(T)$, and $M \vdash S$. 

If $e$ is a value then by inversion of $M;\emptyset;\emptyset \vdash_t \theta;e : \langle\rangle \,\&\, (\gamma;\gamma')$ we obtain that $\gamma = \gamma'$, $E[e] = \square[()]$ and 
$\forall \iota.\ \theta(\iota) = (0,0)$, as a consequence of $\forall r^{\kappa} \in \gamma.\ \kappa = (0,0)$ and $\mathit{counts\_ok}(\square[\mathtt{pop}_{\gamma}\,\square], \theta)$. Thus, rule E-T can 
be applied. 

If $e$ is not a value then it can be trivially shown (by induction on the typing derivation of $e$) that there 
exists a redex $u$ and an evaluation context $E$ such that $e = E[u]$. By inversion of the thread typing 
derivation for $e$ we obtain that $M;\emptyset;\emptyset \vdash u : \tau \,\&\, (\gamma_a;\gamma_b)$, $M;\emptyset;\emptyset \vdash E : \tau \xrightarrow{\gamma_a;\gamma_b} \langle\rangle \,\&\, (\gamma;\gamma')$ and 
$\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$ hold. 

Then, we proceed by performing a case analysis on $u$ (we only consider the most interesting cases): 

Case $(\lambda x.e' \ \mathtt{as}\ \tau\ \, v)^{\mathtt{par}}$: it suffices to show that $(\theta_1,\theta_2) = \mathit{split}(\theta, \mathit{max}(\gamma_c))$ is defined, where $\gamma_c$ is the 
annotation of type $\tau$. If $\mathit{max}(\gamma_c)$ is empty, then the proof is immediate from the base case of the split 
function. Otherwise, we must show that for all $\iota$, the count $\theta(\iota)$ is greater than or equal to the 
sum of all $(\iota@n)^{\kappa}$ in $\mathit{max}(\gamma_c)$. This can be shown by considering $\mathtt{par} \vdash \gamma_b = \gamma_a \oplus \gamma_c$ (i.e., the max 
counts in $\gamma_c$ are less than or equal to the max counts in $\gamma_b$), which can be obtained by inversion 
of the typing derivation of $(\lambda x.e' \ \mathtt{as}\ \tau\ \, v)^{\mathtt{par}}$, and the exact correspondence between static ($\gamma_b$) and 
dynamic counts (i.e., $\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$). Thus, rule E-SN can be applied to perform a 
single step. 

Case $\mathtt{share}\ \mathtt{loc}_\iota$: $\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$ establishes an exact correspondence between dynamic and 
static counts. The typing derivation implies that $\gamma_a(\iota@n_1) \geq (2,0)$, for some $n_1$ existentially bound 
in the premise of the derivation. Therefore, $\theta(\iota) \geq (1,0)$. It is possible to perform a single step using 
rule E-SH. The cases for $\mathtt{release}\ \mathtt{loc}_\iota$ and $\mathtt{unlock}\ \mathtt{loc}_\iota$ can be shown in a similar manner. 

Case $\mathtt{lock}_{\gamma_a}\ \mathtt{loc}_\iota$: similarly to the previous case we can show that $\theta(\iota) = (n_1,n_2)$ and $n_1$ is positive. If $n_2$ is 
positive, rule E-LK1 can be applied. Otherwise, $n_2$ is zero. Let $\epsilon$ be equal to $\mathit{locked}(T_1) \cap 
\mathit{lockset}(\iota, 1, E[\mathtt{pop}_{\gamma_a}\,\square])$. If $\epsilon$ is empty then rule E-LK0 can be applied in order to perform a single 
step. Otherwise, the $\mathit{blocked}(T, n)$ predicate holds and the configuration is not stuck. 

Case $\mathtt{deref}\ \mathtt{loc}_\iota$: it can be trivially shown (as in the previous case of share, where we proved $\theta(\iota) \geq 
(1,0)$) that $\theta(\iota) \geq (1,1)$, and since $\mathit{mutex}(T_1, n:\theta;E[\mathtt{deref}\ \mathtt{loc}_\iota])$ holds, then $\iota \notin \mathit{locked}(T_1)$ and 
thus rule E-D can be used to perform a step. The case of $\mathtt{loc}_\iota := v$ can be shown in a similar 
manner. □ 
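
The lock case of the progress argument amounts to a three-way decision, which can be sketched as follows. This is only an illustration under stated assumptions: the function name `lock_step` and its string results are hypothetical, the future lockset is passed in precomputed (and is assumed to contain the requested location itself), and `locked_by_others` stands for the set of locations locked by the remaining threads.

```python
def lock_step(counts, future_lockset, locked_by_others):
    """Decide which rule applies to a lock request on a location.

    counts           -- (reference count, lock count) of the location in the
                        requesting thread's access list
    future_lockset   -- assumed precomputed future lockset, including the
                        requested location itself
    locked_by_others -- locations currently locked by the other threads
    """
    refs, locks = counts
    if refs <= 0:
        raise ValueError("ill-typed request: the location is not live")
    if locks > 0:
        return "E-LK1"      # re-entrant acquisition, no availability check
    conflict = future_lockset & locked_by_others
    if not conflict:
        return "E-LK0"      # first acquisition, the whole lockset is free
    return "blocked"        # some lock in the future lockset is unavailable
```

For instance, `lock_step((1, 0), {"y", "x"}, {"x"})` yields `"blocked"`: a request for y is delayed because x, which belongs to the future lockset of y, is held by another thread.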

Lemma 3 (Preservation) Let $S;T$ be a well-typed configuration with $M \vdash S;T$. If the operational 
semantics takes a step $S;T \leadsto S';T'$, then there exists $M' \supseteq M$ such that the resulting configuration is 
well-typed with $M' \vdash S';T'$. 

Proof. We proceed by case analysis on the thread evaluation relation (we only consider a few cases due 
to space limitations): 

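Case E-A: Rule E-A implies $S' = S$, $T' = T, n:\theta;E[\mathtt{pop}_{\gamma_a}\, e_1[v/x]]$ and $e = (\lambda x.e_1 \ \mathtt{as}\ \tau_1 \xrightarrow{\gamma_c} \tau_2\ \, v)^{\mathtt{seq}(\gamma_a)}$. 

By inversion of the configuration typing assumption we have that $\mathit{mutex}(T, n:\theta;E[e])$ and $M;\emptyset;\emptyset \vdash_t 
\theta;E[e] : \langle\rangle \,\&\, (\gamma;\gamma')$ hold. It suffices to show that $\mathit{mutex}(T, n:\theta;E[\mathtt{pop}_{\gamma_a}\, e_1[v/x]])$ and $M;\emptyset;\emptyset \vdash_t 
\theta;E[\mathtt{pop}_{\gamma_a}\, e_1[v/x]] : \langle\rangle \,\&\, (\gamma;\gamma')$ hold. The former is immediate from $\mathit{mutex}(T, n:\theta;E[e])$ as no new 
locks are acquired. Now we proceed with the latter, which can be shown by proving that $M;\emptyset;\emptyset \vdash 
\mathtt{pop}_{\gamma_a}\, e_1[v/x] : \tau'_2 \,\&\, (\gamma_a;\gamma_b)$ holds. By inversion on the thread typing derivation of $E[e]$ we have 

$M;\emptyset;\emptyset \vdash v : \tau'_1 \,\&\, (\gamma_b;\gamma_b)$, $\mathtt{seq}(\gamma_a) \vdash \gamma_b = \gamma_a \oplus \gamma'_c$ and $M;\emptyset;\emptyset \vdash \lambda x.e_1 \ \mathtt{as}\ \tau_1 \xrightarrow{\gamma_c} \tau_2 : \tau'_1 \xrightarrow{\gamma'_c} \tau'_2 \,\&\, (\gamma_b;\gamma_b)$, 

where $\tau'_1 \xrightarrow{\gamma'_c} \tau'_2 \simeq \tau_1 \xrightarrow{\gamma_c} \tau_2$. We can use proof by induction on the expression typing relation to 
show that if $v$ is well typed with $\tau'_1$, then it is also well typed with $\tau_1$ provided that $\tau_1 \simeq \tau'_1$. 
Therefore, $M;\emptyset;\emptyset \vdash v : \tau_1 \,\&\, (\gamma_b;\gamma_b)$ holds. By inversion of the function typing derivation we 
obtain that $\mathtt{seq}(\emptyset) \vdash \gamma_c \Rightarrow M;\emptyset;\emptyset, x:\tau_1 \vdash e_1 : \tau_2 \,\&\, (\mathit{min}(\gamma_c);\gamma_c)$. $\mathtt{seq}(\emptyset) \vdash \gamma'_c$ (a premise of $\mathtt{seq}(\gamma_a) \vdash 
\gamma_b = \gamma_a \oplus \gamma'_c$) and $\gamma_c \simeq \gamma'_c$ imply that $\mathtt{seq}(\emptyset) \vdash \gamma_c$ holds, thus $M;\emptyset;\emptyset, x:\tau_1 \vdash e_1 : \tau_2 \,\&\, (\mathit{min}(\gamma_c);\gamma_c)$ 
holds. By applying the standard value substitution lemma on the new typing derivation of $v$ we 
obtain that $M;\emptyset;\emptyset \vdash e_1[v/x] : \tau_2 \,\&\, (\mathit{min}(\gamma_c);\gamma_c)$ holds. The application of rule T-PP implies that 
$M;\emptyset;\emptyset \vdash \mathtt{pop}_{\gamma_a}\, e_1[v/x] : \tau'_2 \,\&\, (\gamma_a;\gamma_b)$ holds. 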

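Case E-LK0, E-LK1, E-UL, E-SH and E-RL: these rules generate side effects as they modify the 
reference/lock count of location $\iota$. We provide a single proof for all cases. Hence, we 
assume here that $u$ (i.e., in $E[u]$) has one of the following forms: $\mathtt{lock}_{\gamma_a}\ \mathtt{loc}_\iota$, $\mathtt{unlock}\ \mathtt{loc}_\iota$, 
$\mathtt{share}\ \mathtt{loc}_\iota$ or $\mathtt{release}\ \mathtt{loc}_\iota$. Rules E-LK0, E-LK1, E-UL, E-SH and E-RL imply that $S' = S$, 
$T' = T, n:\theta';E[()]$, where $()$ replaces $u$ in context $E$ and $\theta$ differs from $\theta'$ only in one 
of the counts of $\iota$ (i.e., $\theta' = \theta[\iota \mapsto \theta(\iota) + (n_1,n_2)]$ and $\gamma_a(\iota) - \kappa = (n_1,n_2)$, where $\gamma_a$ is the input effect of 
$E[u]$). 

By inversion of the configuration typing assumption we have that: 

- $\mathit{mutex}(T, n:\theta;E[u])$: In the case of E-UL, E-SH, E-LK1 and E-RL no new locks are acquired. 
Thus, $\mathit{mutex}(T, n:\theta';E[()])$ holds. In the case of rule E-LK0, a new lock $\iota$ is acquired (i.e., 
the lock count of $\iota$ was zero) and the precondition of E-LK0 guarantees that no other thread 
holds $\iota$: $\mathit{locked}(T) \cap \mathit{lockset}(\iota, 1, E[\mathtt{pop}_{\gamma_a}\,\square]) = \emptyset$. Thus, $\mathit{mutex}(T, n:\theta';E[()])$ holds. 

- $M;\emptyset;\emptyset \vdash_t \theta;E[u] : \langle\rangle \,\&\, (\gamma;\gamma')$: By inversion we have that $M;\emptyset;\emptyset \vdash E : \langle\rangle \xrightarrow{\gamma_a;\gamma_b} \langle\rangle \,\&\, (\gamma;\gamma')$ and 
$M;\emptyset;\emptyset \vdash u : \langle\rangle \,\&\, (\gamma_a;\gamma_b)$, where $\gamma_b = \gamma_a,(\iota@n')^{\kappa}$ for some $n'$. It can be trivially shown from 
the latter derivation that $M;\emptyset;\emptyset \vdash () : \langle\rangle \,\&\, (\gamma_a;\gamma_a)$. We can obtain from the typing derivation of 
$E$ (proof by induction) that $M;\emptyset;\emptyset \vdash E : \langle\rangle \xrightarrow{\gamma_a;\gamma_a} \langle\rangle \,\&\, (\gamma;\gamma'')$, where $\gamma' = \gamma'',(\iota@n')^{\kappa}$. 

- $\mathit{lockset\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$ and $\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$: By the definition of the lockset 
function it can be shown that $\mathit{lockset}(j, n_b, E[\mathtt{pop}_{\gamma_a}\,\square]) \subseteq \mathit{lockset}(j, n_b, E[\mathtt{pop}_{\gamma_b}\,\square])$ for all $j \neq \iota$ 
in the domain of $\theta'$ ($n_b$ is the lock count of $j$ in $\theta$). The same applies for $j = \iota$ in the case of 
rules E-SH and E-RL, as the lock count of $\iota$ is not affected. In the case of rules E-LK0, E-LK1 and 
E-UL we have $\mathit{lockset}(\iota, n_b \pm 1, E[\mathtt{pop}_{\gamma_a}\,\square])$, but this is identical to $\mathit{lockset}(\iota, n_b, E[\mathtt{pop}_{\gamma_b}\,\square])$ 
by the definition of lockset. Therefore $\mathit{lockset\_ok}(E[\mathtt{pop}_{\gamma_a}\,\square], \theta')$ holds. The predicate 
$\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_b}\,\square], \theta)$ enforces the invariant that the static counts are identical to the 
dynamic counts ($\theta$) of $\iota$. The lock count in $\theta$ is modified by $\pm 1$ and $\gamma_a$ differs from 
$\gamma_b$ by $(\iota@n')^{\kappa}$. We can use this fact to show that $\mathit{counts\_ok}(E[\mathtt{pop}_{\gamma_a}\,\square], \theta')$ holds. □ 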

Lemma 4 (Multi-step Program Preservation) Let $S_0;T_0$ be a closed well-typed configuration for 
some $M_0$ and assume that $S_0;T_0$ evaluates to $S_n;T_n$ in $n$ steps. Then for all $i \in [0,n]$, $M_i \vdash S_i;T_i$ holds. 

Proof. By induction on the number of steps $n$ using Lemma 3. □ 

Theorem 1 (Type Safety) Let expression $e$ be the initial program and let the initial typing context $M_0$ 
and the initial program configuration $S_0;T_0$ be defined as follows: $M_0 = \emptyset$, $S_0 = \emptyset$, and $T_0 = \{0 : \emptyset;e\}$. 
If $S_0;T_0$ is well-typed in $M_0$ and the operational semantics takes any number of steps $S_0;T_0 \leadsto^{n} S_n;T_n$, 
then the resulting configuration $S_n;T_n$ is not stuck and $T_n$ has not reached a deadlocked state. 

Proof. The application of Lemma 4 to the typing derivation of $S_0;T_0$ implies that for all steps $i$ from zero 
to $n$ there exists an $M_i$ such that $M_i \vdash S_i;T_i$. Therefore, Lemma 1 implies that $\neg\mathit{deadlocked}(T_n)$ and 
Lemma 2 implies $S_n;T_n$ is not stuck. □ 

Typing the initial configuration $S_0;T_0$ with the empty typing context $M_0$ guarantees that all functions 
in the program are closed and that no explicit location values ($\mathtt{loc}_\iota$) are used in the original program. 

5 Concluding Remarks 

The main contribution of this work is type-based deadlock avoidance for a language with unstructured 
locking primitives and the meta-theory for the proposed semantics. The type system presented in this 
paper guarantees that well-typed programs will not deadlock at execution time. This is possible by 
statically verifying that program annotations reflect the order of future lock operations and using the 
annotations at execution time to avoid deadlocks. The main advantage over purely static approaches to 
deadlock freedom is that our type system accepts a wider class of programs as it does not enforce a total 
order on lock acquisition. The main disadvantage of our approach is that it imposes additional run-time 
overhead induced by the future lockset computation and by blocking time (i.e., both the requested lock 
and its future lockset must be available). Additionally, in some cases threads may unnecessarily block 
because our type and effect system is conservative. For example, when a thread locks x and executes a 
lengthy computation (without acquiring other locks) before releasing x, it would be safe to allow another 
thread to lock y even if x is in its future lockset. 

We have shown that this is a non-trivial extension for existing type systems based on deadlock avoid- 
ance. There are three significant sources of complexity: (i) lock acquisition and release operations may 
not be properly nested, (ii) lock-unlock pairs may span multiple contexts: function calls that contain lock 
operations may not always increase the size of lockset, but instead limit the lockset size. In addition, 
future locksets must be computed in a context-sensitive manner (stack traversal in our case), and (iii) in 
the presence of location (lock) polymorphism and aliasing, it is very difficult for a static type system even 
to detect the previous two sources of complexity. To address lock aliasing without imposing restrictions 
statically, we defer lockset resolution until run-time. 
Appendix 

A.1 Formalism Summary: Operational Semantics 

$\mathit{locked}(T)$: takes a list of threads $T$ and returns the set of locations locked by threads in $T$. 

$\theta +_{\iota} (n_1, n_2)$: updates the map $\theta$ so that the reference and lock counts of $\theta(\iota)$ are incremented 
by $n_1$ and $n_2$ respectively. 

$\theta(\iota)$: returns the reference and lock counts of $\theta(\iota)$. 

$(\theta_1, \theta_2) = \mathit{split}(\theta, \mathit{max}(\gamma_a))$: takes $\gamma_a$ (the effect of a new thread) and $\theta$ and returns $\theta_1$ and $\theta_2$, such that the 
sum of the counts of each location in $\theta_1$ and $\theta_2$ equals the counts of the same 
location in $\theta$. 
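
The sum invariant of split can be made concrete with a small sketch. The encoding is an assumption made for illustration: access lists are dicts from locations to (reference count, lock count) pairs, `needed` stands for the per-location counts demanded by the new thread's effect, and the first component of the result is taken to be the new thread's share (the summary above does not say which component goes to which thread).

```python
def split(theta, needed):
    """Split `theta` into (theta1, theta2) so that counts add up per location.

    Returns None when the split is undefined, i.e. when the new thread
    demands more references or locks than `theta` holds.
    """
    theta1, theta2 = {}, {}
    for loc in theta.keys() | needed.keys():
        refs, locks = theta.get(loc, (0, 0))
        want_refs, want_locks = needed.get(loc, (0, 0))
        if want_refs > refs or want_locks > locks:
            return None
        theta1[loc] = (want_refs, want_locks)            # assumed: new thread
        theta2[loc] = (refs - want_refs, locks - want_locks)
    return theta1, theta2

parent = {"x": (2, 1), "y": (1, 0)}
child_share, remainder = split(parent, {"x": (1, 0)})
```

Here the new thread receives one reference to x, while the spawning thread keeps the other reference, the lock on x, and everything it held on y.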
$\mathit{lockset}(\iota, n, E)$: traverses the evaluation context $E$ and returns the future lockset for $\iota$ acquired 
$n$ times, only examining frames of the form $\mathtt{pop}_{\gamma}\,\square$. The traversal ends when 
$E$ is empty or $n$ is zero. 
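
A simplified model of this traversal is sketched below. It is not the paper's definition: the effect annotation of each pop frame is modelled, as an assumption, by an explicit list of ("lock", location) and ("unlock", location) operations in execution order, rather than by capability-annotated effects, and the result is assumed to contain the requested location itself. Frames are listed innermost first.

```python
def lockset(loc, n, frames):
    """Collect the locks taken while `loc` is still held `n` times.

    frames -- list of pop-frame annotations, innermost first; each annotation
              is a list of ("lock" | "unlock", location) pairs (simplified).
    """
    future = {loc}                    # assumed: the lock itself is included
    for frame in frames:
        for op, target in frame:
            if n == 0:                # loc has been fully released: stop
                return future
            if target == loc:
                n += 1 if op == "lock" else -1
            elif op == "lock":
                future.add(target)    # another lock taken while loc is held
    return future                     # the evaluation context is exhausted

frames = [
    [("lock", "y"), ("unlock", "y"), ("unlock", "x")],  # current function
    [("lock", "z"), ("unlock", "z")],                   # caller's continuation
]
```

With one outstanding acquisition of x the traversal stops at the matching unlock in the first frame; with two, it continues into the caller's frame and also picks up z, which is why the computation is context sensitive.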

A.2 Formalism Summary: Static Semantics 

$M;\Delta \vdash \tau$: well-formedness judgement within a typing context $M;\Delta$ for type $\tau$. 

$M;\Delta \vdash r$: well-formedness judgement within a typing context $M;\Delta$ for location $r$. 

$\vdash M;\Delta;\Gamma;\gamma_1;\gamma_2$: well-formedness judgement for typing context $M;\Delta;\Gamma$ and effect $(\gamma_1;\gamma_2)$. 

$\xi \vdash \gamma$: ensures that pure capabilities are not aliased within $\gamma$. In the case of parallel 
application (i.e., $\xi = \mathtt{par}$), the ending capability of each location must be zero, 
whereas the starting capability of each location must have a zero lock count 
when that capability is impure. 

$\gamma(r)$: returns the most recent (i.e., rightmost) occurrence of $r$ within effect $\gamma$. 

$\mathit{max}(\gamma')$: returns a subset of $\gamma'$, say $\gamma$, such that no duplicate locations or branches exist, 
the domain of $\gamma'$ equals the domain of $\gamma$ and each element of $\gamma$ is equal to $\gamma'(r)$ 
for any $r$ in the domain of $\gamma$. 

$\mathit{min}(\gamma')$: takes $\gamma'$ and returns a prefix $\gamma$ of $\gamma'$ such that no duplicate locations or branches 
exist and the domain of $\gamma'$ equals the domain of $\gamma$. 

$\gamma \backslash r$: takes $\gamma$ and $r$ and removes all occurrences of $r'$ from $\gamma$ such that $r'$ is identical 
to $r$ modulo the tags of constant locations. 

$\xi \vdash \gamma' = \gamma \oplus \gamma_1$: takes $\gamma$, representing the environment effect before a function call, and the function 
effect $\gamma_1$, and yields the environment effect $\gamma'$ representing the environment 
effect after the function call. $\gamma$ is a prefix of $\gamma'$ and the suffix of $\gamma'$ is an 
adjusted version of $\gamma_1$: the order of locations is the same as in $\gamma_1$ but the counts 
may be greater than the ones in $\gamma_1$, as some counts may have been abstracted 
within the scope of the function. It also enforces $\xi \vdash \gamma$. 

$\kappa \geq \kappa'$: true if both counts of $\kappa$ are no smaller than the corresponding counts of $\kappa'$. 

$\kappa + \kappa'$, $\kappa - \kappa'$: calculate the sum and difference of two capabilities (considered here as 
two-dimensional vectors). 
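
Treating capabilities as two-dimensional vectors, the three operations above can be written out directly. The sketch assumes that a capability is a Python pair (reference count, lock count); the helper names are illustrative only.

```python
def cap_ge(k1, k2):
    """Componentwise order: both counts of k1 are no smaller than those of k2."""
    return k1[0] >= k2[0] and k1[1] >= k2[1]

def cap_add(k1, k2):
    """Componentwise sum of two capabilities."""
    return (k1[0] + k2[0], k1[1] + k2[1])

def cap_sub(k1, k2):
    """Componentwise difference of two capabilities."""
    return (k1[0] - k2[0], k1[1] - k2[1])
```

The order is partial: (2, 0) and (1, 1) are incomparable, so a thread holding two references but no lock does not satisfy the "at least (1, 1)" requirement for dereferencing.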

$\mathit{summary}(\gamma)$: used primarily for calculating the summarized effects of recursive functions. 

$\tau \simeq \tau'$: true when $\tau$ and $\tau'$ are structurally equivalent after removing $@n$ annotations 
from locations. 

$M;\Delta;\Gamma \vdash E : \tau \xrightarrow{\gamma_a;\gamma_b} \tau' \,\&\, (\gamma_1;\gamma_2)$: the evaluation context typing judgement that takes the typing context $M;\Delta;\Gamma$, 
the evaluation context $E$, the expected effect $(\gamma_a;\gamma_b)$ and the expected type $\tau$ 
(for the innermost hole in $E$), and the input effect $\gamma_1$, and returns the type $\tau'$ and the 
effect $\gamma_2$ that will be returned by $E$ when it is filled with an expression of type 
$\tau$ and effect $(\gamma_a;\gamma_b)$. 

A.3 Formalism Summary: Type Safety 

$\mathit{blocked}(T, n)$: true when thread $n$ of thread list $T$ is in a blocked (i.e., waiting for a lock) state. 

$\mathit{mutex}(T)$: true when each lock is held by at most one thread of $T$. 

$\mathit{counts\_ok}(E, \theta)$: takes an evaluation context $E$ and an access list $\theta$ and holds when the sum of all 
pop expression annotations in $E$ equals the counts of $\theta$. It establishes an exact 
correspondence between dynamic and static counts. 

$\mathit{lockset\_ok}(E, \theta)$: takes an evaluation context $E$ and an access list $\theta$ and holds when the future 
lockset (lockset function) of an acquired lock at any program point is always a 
subset of the future lockset computed when the lock was initially acquired. 



